
    Fast Ridge Regression with Randomized Principal Component Analysis and Gradient Descent

    Full text link
    We propose a new two-stage algorithm, LING, for large-scale regression problems. LING has the same risk as the well-known Ridge Regression under the fixed design setting and can be computed much faster. Our experiments show that LING performs well in terms of both prediction accuracy and computational efficiency compared with other large-scale regression algorithms such as Gradient Descent, Stochastic Gradient Descent, and Principal Component Regression, on both simulated and real datasets.
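    The abstract does not spell out the two stages, so the following is only a rough sketch of one plausible reading suggested by the title: regress on a few randomized principal components, then refine the fit with gradient descent on the ridge objective. The function names, the number of components k, the step size, and the iteration count are illustrative assumptions, not the authors' algorithm.

        import numpy as np

        def randomized_top_k(X, k, oversample=10, seed=0):
            # Gaussian sketch + QR gives an approximate basis for the top-k column space.
            rng = np.random.default_rng(seed)
            G = rng.standard_normal((X.shape[1], k + oversample))
            Q, _ = np.linalg.qr(X @ G)              # n x (k+oversample) orthonormal basis
            _, _, Vt = np.linalg.svd(Q.T @ X, full_matrices=False)
            return Vt[:k].T                         # p x k approximate top right singular vectors

        def ling_like_fit(X, y, k=20, lam=1.0, n_iter=100):
            # Stage 1: ridge fit on k randomized principal components.
            V = randomized_top_k(X, k)
            Z = X @ V                               # n x k reduced design
            w = np.linalg.solve(Z.T @ Z + lam * np.eye(k), Z.T @ y)
            beta = V @ w                            # back to the original coordinates
            # Stage 2: plain gradient descent on the full ridge objective.
            lr = 1.0 / (np.linalg.norm(X, 2) ** 2 + lam)   # step size from a spectral-norm bound
            for _ in range(n_iter):
                grad = X.T @ (X @ beta - y) + lam * beta
                beta = beta - lr * grad
            return beta

    Stage 1 is cheap because it only solves a k x k system; stage 2 cleans up whatever variance the top components miss.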

    The Hedge Fund Game

    Get PDF
    This paper examines theoretical properties of incentive contracts in the hedge fund industry. We show that it is very difficult to structure incentive payments that distinguish between unskilled managers, who cannot generate excess market returns, and skilled managers, who can deliver such returns. Under any incentive scheme that does not levy penalties for underperformance, managers with no investment skill can game the system so as to earn (in expectation) the same amount per dollar of funds under management as the most skilled managers. We consider various ways of eliminating this “piggy-back effect,” such as forcing the manager to hold an equity stake or levying penalties for underperformance. The nature of the derivatives market means that none of these remedies can correct the problem entirely.
    Keywords: incentive contracts, excess returns
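    As a rough, self-contained illustration of the piggy-back effect (a stylized simulation with made-up parameters, not the paper's model or results): a manager with no skill sells fairly priced insurance against a rare crash, books the premium as apparent excess return, and collects fees in expectation because fees earned in good years are never returned after the crash.

        import numpy as np

        def expected_fee_per_dollar(p_crash=0.05, mgmt_fee=0.02, perf_fee=0.20,
                                    years=10, n_sims=200_000, seed=0):
            # Stylized "no skill" strategy: each year the fund sells insurance against a
            # rare crash.  The premium is fairly priced, so the bet has zero expected
            # excess return, yet the manager still collects fees in expectation because
            # fees are paid on the good years and kept after the crash.
            premium = p_crash / (1.0 - p_crash)    # fair premium: zero expected edge
            rng = np.random.default_rng(seed)
            fees = np.zeros(n_sims)
            for s in range(n_sims):
                nav = 1.0                          # fund value per dollar invested
                for _ in range(years):
                    if rng.random() < p_crash:
                        nav = 0.0                  # crash wipes out the fund (stylized)
                        break
                    gross = nav * (1.0 + premium)  # premium booked as apparent alpha
                    year_fee = mgmt_fee * gross + perf_fee * (gross - nav)
                    fees[s] += year_fee
                    nav = gross - year_fee
            return fees.mean()

        print(expected_fee_per_dollar())           # strictly positive despite zero skill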

    Regret testing: learning to play Nash equilibrium without knowing you have an opponent

    Get PDF
    A learning rule is uncoupled if a player does not condition his strategy on the opponent's payoffs. It is radically uncoupled if a player does not condition his strategy on the opponent's actions or payoffs. We demonstrate a family of simple, radically uncoupled learning rules whose period-by-period behavior comes arbitrarily close to Nash equilibrium behavior in any finite two-person game.
    Keywords: learning, Nash equilibrium, regret, bounded rationality
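    A minimal sketch, under illustrative assumptions, of a regret-testing style rule in this spirit: play a random mixed strategy for a block of periods, occasionally experiment with other actions, and at the end of the block resample the strategy at random if the measured regret exceeds a tolerance. The block length, tolerance, experimentation rate, and resampling distribution below are not the authors' constants; only the player's own realized payoffs are used.

        import numpy as np

        class RegretTester:
            # A radically uncoupled learner: it observes only its own realized payoff
            # each period, never the opponent's actions or payoffs.
            def __init__(self, n_actions, block=1000, tol=0.05, explore=0.05, seed=0):
                self.n, self.block, self.tol, self.explore = n_actions, block, tol, explore
                self.rng = np.random.default_rng(seed)
                self._new_strategy()

            def _new_strategy(self):
                self.p = self.rng.dirichlet(np.ones(self.n))   # fresh random mixed strategy
                self._reset_block()

            def _reset_block(self):
                self.t = 0
                self.base_sum, self.base_count = 0.0, 0
                self.exp_sum = np.zeros(self.n)                # experimental payoffs per action
                self.exp_count = np.zeros(self.n)

            def act(self):
                if self.rng.random() < self.explore:           # occasional experiment
                    self.last, self.was_exp = int(self.rng.integers(self.n)), True
                else:
                    self.last, self.was_exp = int(self.rng.choice(self.n, p=self.p)), False
                return self.last

            def observe(self, payoff):
                if self.was_exp:
                    self.exp_sum[self.last] += payoff
                    self.exp_count[self.last] += 1
                else:
                    self.base_sum += payoff
                    self.base_count += 1
                self.t += 1
                if self.t >= self.block:                       # end of block: run the regret test
                    base = self.base_sum / max(self.base_count, 1)
                    alt = np.max(self.exp_sum / np.maximum(self.exp_count, 1))
                    if alt - base > self.tol:
                        self._new_strategy()                   # too much regret: resample at random
                    else:
                        self._reset_block()                    # low regret: keep the same mixture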

    Spectral dimensionality reduction for HMMs

    Get PDF
    Hidden Markov Models (HMMs) can be accurately approximated from co-occurrence frequencies of pairs and triples of observations using a fast spectral method, in contrast to the usual slower methods such as EM or Gibbs sampling. We provide a new spectral method which significantly reduces the number of model parameters that need to be estimated and yields a sample complexity that does not depend on the size of the observation vocabulary. We present an elementary proof giving bounds on the relative accuracy of probability estimates from our model. (Corollaries show our bounds can be weakened to provide either L1 bounds or KL bounds, which provide easier direct comparisons to previous work.) Our theorem uses conditions that are checkable from the data, instead of putting conditions on the unobservable Markov transition matrix.
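    For background, the following is a sketch of the generic spectral, observable-operator recipe for HMMs estimated from unigram, bigram, and trigram co-occurrence statistics; it is not the reduced-parameter estimator proposed in this paper, and the variable names and shapes are illustrative.

        import numpy as np

        def spectral_hmm(P1, P21, P3x1, m):
            # P1[i]         = P[x1 = i]                    (unigram vector, length V)
            # P21[i, j]     = P[x2 = i, x1 = j]            (bigram matrix, V x V)
            # P3x1[x][i, j] = P[x3 = i, x2 = x, x1 = j]    (trigram slices, V x V x V)
            # m             = number of hidden states to keep
            U, _, _ = np.linalg.svd(P21, full_matrices=False)
            U = U[:, :m]                                   # top-m left singular vectors
            b1 = U.T @ P1                                  # initial state vector
            binf = np.linalg.pinv(P21.T @ U) @ P1          # normalization vector
            B = [U.T @ P3x1[x] @ np.linalg.pinv(U.T @ P21) # one observable operator per symbol
                 for x in range(P21.shape[0])]
            return b1, binf, B

        def sequence_prob(b1, binf, B, xs):
            # Estimated joint probability of the observation sequence xs.
            b = b1
            for x in xs:
                b = B[x] @ b
            return float(binf @ b)

    Everything is computed from observable co-occurrence statistics; the hidden transition and emission matrices are never estimated directly.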

    A Proof of Calibration Via Blackwell's Approachability Theorem

    Get PDF
    Over the past few years many proofs of the existence of calibration have been discovered. Each of the following provides a different algorithm and proof of convergence: D. Foster and R. Vohra (1991, Technical Report, University of Chicago), (1998, Biometrika 85, 379–390), S. Hart (1995, personal communication), D. Fudenberg and D. Levine (1999, Games Econ. Behavior 29, 104–130), and S. Hart and A. Mas-Colell (1997, Technical Report, Hebrew University). Does the literature really need one more? Probably not. But the algorithm proposed here has two virtues. First, it only randomizes between two forecasts that are very close to each other (either p or p + ϵ). In other words, the randomization only hides the last digit of the forecast. Second, it follows directly from Blackwell's approachability theorem, which shortens the proof substantially. Journal of Economic Literature Classification Numbers: C70, C73, C53.
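    For readers who want the formal target, one standard way to score calibration (a common definition; the paper's exact score may differ) restricts forecasts to the grid \{0, \epsilon, 2\epsilon, \dots, 1\} and sets

        C_T \;=\; \sum_{p \in \{0,\epsilon,\dots,1\}} \frac{n_T(p)}{T}\,\bigl(\bar{y}_T(p) - p\bigr)^2,

    where n_T(p) counts the rounds up to time T on which forecast p was issued and \bar{y}_T(p) is the empirical frequency of the event on those rounds. A forecasting scheme is calibrated if C_T \to 0 almost surely on every data sequence, which is the property obtained here by randomizing between the adjacent grid points p and p + \epsilon.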

    Prediction in the Worst Case

    Get PDF
    A predictor is a method of estimating the probability of future events over an infinite data sequence. One predictor is as strong as another if, for all data sequences, the former has at most the mean square error (MSE) of the latter. Given any countable set D of predictors, we explicitly construct a predictor S that is at least as strong as every element of D. Finite-sample bounds are also given, which hold uniformly over the space of all possible data.
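    One standard way to build a single predictor that competes with a countable pool is exponentially weighted averaging with prior weights that sum over the pool; the sketch below illustrates that idea and is not claimed to be the construction used in the paper. The learning rate eta and the prior_decay constant are illustrative choices.

        import numpy as np

        def aggregate(predictors, data, eta=2.0, prior_decay=2.0):
            # predictors[i](past) returns a forecast in [0, 1] for the next outcome,
            # given the sequence observed so far; data is the outcome sequence.
            # The prior weight of predictor i is proportional to prior_decay**-(i + 1),
            # so the weights sum even over a countably infinite pool.
            log_w = np.array([-(i + 1) * np.log(prior_decay) for i in range(len(predictors))])
            past, combined = [], []
            for y in data:
                preds = np.array([f(past) for f in predictors])
                w = np.exp(log_w - log_w.max())
                w /= w.sum()
                combined.append(float(w @ preds))       # the aggregate forecast
                log_w = log_w - eta * (preds - y) ** 2  # penalize each predictor's squared error
                past.append(y)
            return combined

    For example, aggregate([lambda past: 0.5, lambda past: float(np.mean(past)) if past else 0.5], data) blends a constant forecaster with a running-mean forecaster, weighting whichever has the smaller cumulative squared error.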

    Variable Selection in Data Mining: Building a Predictive Model for Bankruptcy

    Get PDF
    We predict the onset of personal bankruptcy using least squares regression. Although bankruptcy is well publicized, only 2,244 bankruptcies occur in our dataset of 2.9 million months of credit-card activity. We use stepwise selection to find predictors of these from a mix of payment history, debt load, demographics, and their interactions. This combination of rare responses and over 67,000 possible predictors leads to a challenging modeling question: How does one separate coincidental from useful predictors? We show that three modifications turn stepwise regression into an effective methodology for predicting bankruptcy. Our version of stepwise regression (1) organizes calculations to accommodate interactions, (2) exploits modern decision theoretic criteria to choose predictors, and (3) conservatively estimates p-values to handle sparse data and a binary response. Omitting any one of these leads to poor performance. A final step in our procedure calibrates regression predictions. With these modifications, stepwise regression predicts bankruptcy as well as, if not better than, recently developed data-mining tools. When sorted, the largest 14,000 resulting predictions hold 1,000 of the 1,800 bankruptcies hidden in a validation sample of 2.3 million observations. If the cost of missing a bankruptcy is 200 times that of a false positive, our predictions incur less than 2/3 of the costs of classification errors produced by the tree-based classifier C4.5.
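    The sketch below illustrates the flavor of forward stepwise selection with a conservative, Bonferroni-style entry threshold, one ingredient of the approach described above; the paper's actual criteria, handling of interactions, sparse-data p-value adjustment, and final calibration step are more involved, and the thresholds and normal approximation used here are illustrative assumptions.

        import numpy as np
        from math import erf, sqrt

        def forward_stepwise(X, y, alpha=0.05, max_terms=30):
            # Forward stepwise OLS with a Bonferroni-style entry rule: a candidate
            # enters only if its p-value beats alpha / p, which guards against
            # coincidental predictors when there are tens of thousands of candidates.
            n, p = X.shape
            selected = []
            coef = np.array([y.mean()])
            residual = y - y.mean()
            while len(selected) < max_terms:
                best_j, best_t = None, 0.0
                for j in range(p):
                    if j in selected:
                        continue
                    x = X[:, j] - X[:, j].mean()
                    sxx = float(x @ x)
                    if sxx == 0.0:
                        continue
                    b = float(x @ residual) / sxx
                    r = residual - b * x
                    s2 = float(r @ r) / max(n - len(selected) - 2, 1)
                    t = abs(b) / sqrt(s2 / sxx)
                    if t > best_t:
                        best_j, best_t = j, t
                p_val = 2.0 * (1.0 - 0.5 * (1.0 + erf(best_t / sqrt(2.0))))  # normal approx.
                if best_j is None or p_val > alpha / p:
                    break
                selected.append(best_j)
                Xs = np.column_stack([np.ones(n)] + [X[:, j] for j in selected])
                coef, *_ = np.linalg.lstsq(Xs, y, rcond=None)
                residual = y - Xs @ coef
            return selected, coef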